TTEST_IND_STATS

Overview

The TTEST_IND_STATS function performs an independent samples t-test using summary statistics (mean, standard deviation, and sample size) rather than raw data. This is particularly useful when only descriptive statistics are available, such as when analyzing published research results or aggregated data.

The function tests the null hypothesis that two independent samples have identical population means. It wraps scipy.stats.ttest_ind_from_stats from the SciPy library. The source code is available on GitHub.

The t-statistic is calculated as:

t = \frac{\bar{x}_1 - \bar{x}_2}{SE}

where \bar{x}_1 and \bar{x}_2 are the sample means and SE is the standard error of the difference.

When equal_var is TRUE, the function performs Student’s t-test, which assumes equal population variances. The standard error is computed from a pooled variance that weights each sample variance by its degrees of freedom, and the test uses nobs_one + nobs_two - 2 degrees of freedom. For more details, see the Wikipedia article on the independent two-sample t-test.
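As a minimal sketch of the pooled calculation described above, the snippet below recomputes Student's t-statistic and two-sided p-value from summary statistics and compares them with scipy.stats.ttest_ind_from_stats. The helper name pooled_t and the input numbers are illustrative assumptions and are not part of TTEST_IND_STATS.

```python
import math
from scipy.stats import t as t_dist, ttest_ind_from_stats

def pooled_t(mean1, std1, n1, mean2, std2, n2):
    """Hypothetical helper: two-sided Student's t-test from summary statistics."""
    df = n1 + n2 - 2
    # Pooled variance: each sample variance weighted by its degrees of freedom.
    pooled_var = ((n1 - 1) * std1 ** 2 + (n2 - 1) * std2 ** 2) / df
    # Standard error of the difference between the two sample means.
    se = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    t_stat = (mean1 - mean2) / se
    # Two-sided p-value from the t distribution's survival function.
    p_value = 2.0 * t_dist.sf(abs(t_stat), df)
    return t_stat, p_value

manual = pooled_t(5.0, 2.0, 20, 4.0, 3.0, 15)
reference = ttest_ind_from_stats(5.0, 2.0, 20, 4.0, 3.0, 15, equal_var=True)
print(manual, (reference.statistic, reference.pvalue))
```

If the two printed pairs agree, the manual formula matches what SciPy computes for the equal-variance case.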

When equal_var is FALSE, the function performs Welch’s t-test, which does not assume equal variances and is more robust when sample sizes or variances differ between groups. Welch’s test uses the Welch-Satterthwaite equation to approximate degrees of freedom. See the Welch’s t-test Wikipedia article for the full derivation.
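The Welch case can be sketched the same way. The snippet below is an illustration under assumed inputs: it computes the unpooled standard error and the Welch-Satterthwaite degrees of freedom by hand, then compares the result with scipy.stats.ttest_ind_from_stats called with equal_var=False. The helper name welch_t is hypothetical.

```python
import math
from scipy.stats import t as t_dist, ttest_ind_from_stats

def welch_t(mean1, std1, n1, mean2, std2, n2):
    """Hypothetical helper: two-sided Welch's t-test from summary statistics."""
    # Per-sample variance of the mean.
    v1 = std1 ** 2 / n1
    v2 = std2 ** 2 / n2
    # Unpooled standard error of the difference.
    se = math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    t_stat = (mean1 - mean2) / se
    p_value = 2.0 * t_dist.sf(abs(t_stat), df)
    return t_stat, p_value, df

welch_manual = welch_t(5.0, 2.0, 20, 4.0, 3.0, 15)
welch_reference = ttest_ind_from_stats(5.0, 2.0, 20, 4.0, 3.0, 15, equal_var=False)
print(welch_manual, (welch_reference.statistic, welch_reference.pvalue))
```

The approximate degrees of freedom are generally not an integer, which is why the p-value comes from a t distribution with a fractional parameter.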

The ttest_alt parameter selects one of three alternative hypotheses: “two-sided” tests whether the population means differ in either direction, “less” tests whether the mean of the first population is less than that of the second, and “greater” tests whether it is greater.
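To illustrate how the three alternatives relate, the following sketch calls scipy.stats.ttest_ind_from_stats directly with each option on the same summary statistics. The input values and the variable names summary and p_values are assumptions chosen for the example.

```python
from scipy.stats import ttest_ind_from_stats

# Illustrative summary statistics (not taken from any real study).
summary = dict(mean1=1.0, std1=1.0, nobs1=10, mean2=1.5, std2=1.0, nobs2=10)

p_values = {
    alt: float(ttest_ind_from_stats(**summary, alternative=alt).pvalue)
    for alt in ("two-sided", "less", "greater")
}
print(p_values)

# The two one-sided p-values are complementary, and the two-sided p-value
# is twice the smaller of the two.
one_sided_sum = p_values["less"] + p_values["greater"]
doubled_min = 2.0 * min(p_values["less"], p_values["greater"])
```

Because the first mean is below the second in this example, the “less” alternative yields the smaller one-sided p-value.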

Note that the std_one and std_two parameters expect the corrected sample standard deviation (computed with ddof=1), which is the standard output from most statistical software and Excel’s STDEV.S function.
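If only a population standard deviation (ddof=0) is available, it can be rescaled to the corrected form before it is passed in. The sketch below assumes NumPy is available and uses a made-up data set; the variable names are illustrative.

```python
import math
import numpy as np

# Made-up raw data, used only to demonstrate the conversion.
data = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
n = data.size

pop_std = np.std(data)             # ddof=0 (population standard deviation)
sample_std = np.std(data, ddof=1)  # ddof=1 (corrected sample standard deviation)

# Rescale the ddof=0 value to the ddof=1 value expected by std_one and std_two.
converted = pop_std * math.sqrt(n / (n - 1))
print(pop_std, sample_std, converted)
```

The correction matters most for small samples, where the factor sqrt(n / (n - 1)) is noticeably larger than 1.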

This example function is provided as-is without any representation of accuracy.

Excel Usage

=TTEST_IND_STATS(mean_one, std_one, nobs_one, mean_two, std_two, nobs_two, equal_var, ttest_alt)
  • mean_one (list[list], required): Mean(s) of sample 1. Each element represents a separate test.
  • std_one (list[list], required): Corrected sample standard deviation(s) of sample 1.
  • nobs_one (list[list], required): Number of observations in sample 1.
  • mean_two (list[list], required): Mean(s) of sample 2. Each element represents a separate test.
  • std_two (list[list], required): Corrected sample standard deviation(s) of sample 2.
  • nobs_two (list[list], required): Number of observations in sample 2.
  • equal_var (bool, optional, default: true): If TRUE, assumes equal population variances. If FALSE, performs Welch’s t-test.
  • ttest_alt (str, optional, default: “two-sided”): Defines the alternative hypothesis. Valid options: “two-sided”, “less”, “greater”.

Returns (list[list]): 2D list with one [statistic, p_value] row per test, or an error message string.

Examples

Example 1: Demo case 1

Inputs:

mean_one std_one nobs_one mean_two std_two nobs_two equal_var ttest_alt
1 1 10 1.5 1 10 true two-sided
2 1 10 2.5 1 10

Excel formula:

=TTEST_IND_STATS({1;2}, {1;1}, {10;10}, {1.5;2.5}, {1;1}, {10;10}, TRUE, "two-sided")

Expected output:

statistic p_value
-1.118 0.2783
-1.118 0.2783

Example 2: Demo case 2

Inputs:

mean_one std_one nobs_one mean_two std_two nobs_two equal_var ttest_alt
1 1 10 1.5 1 10 false less
2 1 10 2.5 1 10

Excel formula:

=TTEST_IND_STATS({1;2}, {1;1}, {10;10}, {1.5;2.5}, {1;1}, {10;10}, FALSE, "less")

Expected output:

statistic p_value
-1.118 0.1391
-1.118 0.1391

Example 3: Demo case 3

Inputs:

mean_one std_one nobs_one mean_two std_two nobs_two equal_var ttest_alt
2 1 10 1 1 10 true greater
3 1 10 2 1 10

Excel formula:

=TTEST_IND_STATS({2;3}, {1;1}, {10;10}, {1;2}, {1;1}, {10;10}, TRUE, "greater")

Expected output:

statistic p_value
2.2361 0.0191
2.2361 0.0191

Example 4: Demo case 4

Inputs:

mean_one std_one nobs_one mean_two std_two nobs_two equal_var ttest_alt
5 2 20 4 2 20 true two-sided
6 2 20 5 2 20

Excel formula:

=TTEST_IND_STATS({5;6}, {2;2}, {20;20}, {4;5}, {2;2}, {20;20}, TRUE, "two-sided")

Expected output:

statistic p_value
1.5811 0.1221
1.5811 0.1221
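The expected outputs above can be cross-checked against SciPy directly. The sketch below reruns the inputs of Example 1 through scipy.stats.ttest_ind_from_stats, one row at a time, and rounds to four decimals. The variable names are illustrative, and the rounding is an assumption about how the results are displayed.

```python
from scipy.stats import ttest_ind_from_stats

# Inputs of Example 1: two tests sharing std = 1 and nobs = 10 in both samples.
mean_one = [1.0, 2.0]
mean_two = [1.5, 2.5]

rows = []
for m1, m2 in zip(mean_one, mean_two):
    res = ttest_ind_from_stats(
        mean1=m1, std1=1.0, nobs1=10,
        mean2=m2, std2=1.0, nobs2=10,
        equal_var=True, alternative="two-sided",
    )
    # One [statistic, p_value] row per test.
    rows.append([round(float(res.statistic), 4), round(float(res.pvalue), 4)])

print(rows)
```

Both rows come out the same because the difference in means, the standard deviations, and the sample sizes are identical in the two tests.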

Python Code

import math
from scipy.stats import ttest_ind_from_stats as scipy_ttest_ind_from_stats

def ttest_ind_stats(mean_one, std_one, nobs_one, mean_two, std_two, nobs_two, equal_var=True, ttest_alt='two-sided'):
    """
    Perform a t-test for means of two independent samples using summary statistics.

    See: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ttest_ind_from_stats.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        mean_one (list[list]): Mean(s) of sample 1. Each element represents a separate test.
        std_one (list[list]): Corrected sample standard deviation(s) of sample 1.
        nobs_one (list[list]): Number of observations in sample 1.
        mean_two (list[list]): Mean(s) of sample 2. Each element represents a separate test.
        std_two (list[list]): Corrected sample standard deviation(s) of sample 2.
        nobs_two (list[list]): Number of observations in sample 2.
        equal_var (bool, optional): If True, assumes equal population variances. If False, performs Welch's t-test. Default is True.
        ttest_alt (str, optional): Defines the alternative hypothesis. Valid options: 'two-sided', 'less', 'greater'. Default is 'two-sided'.

    Returns:
        list[list]: 2D list with one [statistic, p_value] row per test, or error message string.
    """
    def to2d(x):
        return [[x]] if not isinstance(x, list) else x

    def validate_2d_list(arr, name):
        if not isinstance(arr, list):
            return f"Invalid input: {name} must be a 2D list."
        if len(arr) == 0:
            return f"Invalid input: {name} must not be empty."
        for row in arr:
            if not isinstance(row, list) or len(row) < 1:
                return f"Invalid input: {name} must be a 2D list."
            for val in row:
                # bool is a subclass of int, so reject it explicitly.
                if isinstance(val, bool) or not isinstance(val, (int, float)):
                    return f"Invalid input: {name} must contain numeric values."
        return None

    def flatten(arr):
        return [val for row in arr for val in row]

    mean_one = to2d(mean_one)
    std_one = to2d(std_one)
    nobs_one = to2d(nobs_one)
    mean_two = to2d(mean_two)
    std_two = to2d(std_two)
    nobs_two = to2d(nobs_two)

    for arr, name in [
        (mean_one, "mean_one"),
        (std_one, "std_one"),
        (nobs_one, "nobs_one"),
        (mean_two, "mean_two"),
        (std_two, "std_two"),
        (nobs_two, "nobs_two")
    ]:
        err = validate_2d_list(arr, name)
        if err:
            return err

    if not isinstance(equal_var, bool):
        return "Invalid input: equal_var must be a boolean."
    if ttest_alt not in ['two-sided', 'less', 'greater']:
        return "Invalid input: ttest_alt must be 'two-sided', 'less', or 'greater'."

    try:
        m1 = flatten(mean_one)
        s1 = flatten(std_one)
        n1 = flatten(nobs_one)
        m2 = flatten(mean_two)
        s2 = flatten(std_two)
        n2 = flatten(nobs_two)

        if not (len(m1) == len(s1) == len(n1) == len(m2) == len(s2) == len(n2)):
            return "Invalid input: all input arrays must have the same number of elements."

        results = []
        for i in range(len(m1)):
            res = scipy_ttest_ind_from_stats(
                mean1=m1[i], std1=s1[i], nobs1=n1[i],
                mean2=m2[i], std2=s2[i], nobs2=n2[i],
                equal_var=equal_var, alternative=ttest_alt
            )
            t_stat, p_val = res.statistic, res.pvalue

            if math.isnan(t_stat) or math.isinf(t_stat) or math.isnan(p_val) or math.isinf(p_val):
                return "Invalid result: t-statistic or p-value is nan or inf."

            results.append([float(t_stat), float(p_val)])
        return results
    except Exception as e:
        return f"scipy.stats.ttest_ind_from_stats error: {e}"
